Maximum likelihood estimation

Definition

Maximum likelihood estimation (MLE) is a popular method for estimating the parameters of a statistical model.
In other words, it first computes the probability ("how likely") of each data point under a given distribution/model, and then seeks the distribution/model parameters that maximize the joint probability of all data points.

So what exactly is likelihood?
A likelihood function is numerically equal to a conditional probability, but it is treated as a function of the variable after the "|" sign (here, the parameters), with the data held fixed.
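To make this concrete, here is a minimal sketch (assuming SciPy is available) that evaluates one fixed data point under two candidate parameter settings; the data stay the same, and only the parameters, the likelihood's arguments, change:

```python
import scipy.stats as st

x = 1.5  # one observed data point (arbitrary example value)

# Same PDF, two candidate parameter settings: the likelihood is a
# function of the parameters (mu, sigma) with the data x held fixed.
lik_a = st.norm.pdf(x, loc=0.0, scale=1.0)  # L(mu=0, sigma=1 | x)
lik_b = st.norm.pdf(x, loc=1.0, scale=1.0)  # L(mu=1, sigma=1 | x)

print(lik_a, lik_b)  # mu=1 is "more likely" to have produced x=1.5
```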

Implementation

I put MLE applications into two categories: fitting distributions and fitting regressions.

Examples

Distributions

Normal distribution

  1. formulate the PDF of the distribution P(x_i|θ)
P(x_i|\theta) = \mathcal{N}(x_i;\mu,\sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x_i-\mu)^2}{2\sigma^2}}
  2. compute the product of the likelihoods of all data points L(θ) (see the sketch below)
L(\theta) = L(\mu,\sigma) = P(X|\mu,\sigma) = \prod_{i}^{n} \mathcal{N}(x_i;\mu,\sigma)
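As a rough sketch of these two steps in code (in practice the log-likelihood is maximized, since a product of many small densities underflows; variable names are mine, NumPy/SciPy assumed):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=1000)  # synthetic data

def neg_log_likelihood(params):
    mu, log_sigma = params       # optimize log(sigma) so sigma stays positive
    sigma = np.exp(log_sigma)
    # sum of log N(x_i; mu, sigma), negated for a minimizer
    return -np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                   - (x - mu)**2 / (2 * sigma**2))

res = minimize(neg_log_likelihood, x0=[0.0, 0.0])
mu_hat, sigma_hat = res.x[0], np.exp(res.x[1])
# For the normal, the MLE also has a closed form:
# mu_hat ≈ x.mean() and sigma_hat ≈ x.std()
```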

Binomial distribution

  1. formulate the PMF of the distribution P(x_i|θ)
p_i = b(i; n, p) = \binom{n}{i} p^i (1-p)^{n-i}
  2. compute the product of the likelihoods of all data points L(θ) (see the sketch below)
L(\theta) = L(p) = P(X|p) = \prod_{i}^{n} \binom{n}{i} p^i (1-p)^{n-i}
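A minimal sketch, assuming each observation is a count of successes out of n trials; for the binomial, maximizing L(p) has the closed-form solution "total successes / total trials":

```python
import numpy as np

n = 10                              # trials per observation (assumed known)
rng = np.random.default_rng(1)
x = rng.binomial(n, 0.3, size=500)  # observed success counts, true p = 0.3

# Maximizing L(p) = prod_i C(n, x_i) p^{x_i} (1-p)^{n-x_i}
# yields the closed-form MLE: total successes / total trials.
p_hat = x.sum() / (n * len(x))
print(p_hat)  # ~0.3
```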

Regressions

Linear regression

  1. formulate the conditional probability P(y_i|x_i,θ)
    for a linear regression, we have:
\hat{y} = f(x) = \theta x + \eta

e.g., if the noise is i.i.d. normally distributed, \eta \sim \mathcal{N}(0,\sigma^2),
then y is also normally distributed: y \sim \mathcal{N}(\theta x,\sigma^2)

P(y_i|x_i,\theta) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{1}{2\sigma^2}(y_i-\theta x_i)^2}
  2. compute the product of the likelihoods of all y values L(θ) (see the sketch below)
L(\theta) = P(Y|X,\theta) = \prod_{i}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{1}{2\sigma^2}(y_i-\theta x_i)^2}
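A hedged sketch (the names theta_hat and sigma_hat are mine) showing that maximizing this Gaussian likelihood amounts to minimizing the sum of squared residuals:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
x = rng.uniform(-3, 3, size=200)
y = 1.7 * x + rng.normal(0, 0.5, size=200)  # true theta = 1.7, sigma = 0.5

def neg_log_likelihood(params):
    theta, log_sigma = params
    sigma = np.exp(log_sigma)
    resid = y - theta * x
    # -sum log N(y_i; theta * x_i, sigma^2): up to constants, this is the
    # residual sum of squares scaled by sigma, hence ordinary least squares.
    return np.sum(0.5 * np.log(2 * np.pi * sigma**2)
                  + resid**2 / (2 * sigma**2))

res = minimize(neg_log_likelihood, x0=[0.0, 0.0])
theta_hat, sigma_hat = res.x[0], np.exp(res.x[1])
```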

Logistic regression (here using categorical data)

  1. formulate the conditional probability P(y_i|x_i,θ)
    for a logistic function representing binary categorical data ("A" or "B"), we have the probability of "A":
\hat{y} = f(x) = \frac{1}{1+e^{-b(x+c)}}
P(y_i = \text{category A}|x_i,\theta) = \hat{y}_i
P(y_i = \text{category B}|x_i,\theta) = 1-\hat{y}_i
  2. compute the product of the likelihoods of all y values L(θ), encoding y_i = 1 for category A and y_i = 0 for category B (see the sketch below)
L(\theta) = P(Y|X,\theta) = \prod_{i}^{n} \hat{y}_i^{y_i}(1-\hat{y}_i)^{1-y_i}
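A minimal sketch of fitting b and c by numerically maximizing this likelihood, encoding category A as y = 1 and category B as y = 0 (SciPy assumed, variable names mine):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
x = rng.uniform(-4, 4, size=300)
p_true = 1.0 / (1.0 + np.exp(-2.0 * (x + 0.5)))  # true b = 2, c = 0.5
y = rng.binomial(1, p_true)                       # 1 = category A, 0 = category B

def neg_log_likelihood(params):
    b, c = params
    y_hat = 1.0 / (1.0 + np.exp(-b * (x + c)))    # predicted P(category A)
    eps = 1e-12                                   # guard against log(0)
    # -sum of y_i * log(y_hat_i) + (1 - y_i) * log(1 - y_hat_i)
    return -np.sum(y * np.log(y_hat + eps) + (1 - y) * np.log(1 - y_hat + eps))

res = minimize(neg_log_likelihood, x0=[1.0, 0.0])
b_hat, c_hat = res.x
```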